22 research outputs found

    Machine learning-based predictions of dietary restriction associations across ageing-related genes

    BACKGROUND: Dietary restriction (DR) is the most studied pro-longevity intervention; however, a complete understanding of its underlying mechanisms remains elusive, and new research directions may emerge from the identification of novel DR-related genes and DR-related genetic features. RESULTS: This work used a Machine Learning (ML) approach to classify ageing-related genes as DR-related or NotDR-related using 9 different types of predictive features: PathDIP pathways, two types of features based on KEGG pathways, two types of Protein–Protein Interactions (PPI) features, Gene Ontology (GO) terms, Genotype Tissue Expression (GTEx) expression features, GeneFriends co-expression features and protein sequence descriptors. Our findings suggested that features biased towards curated knowledge (i.e. GO terms and biological pathways) had the greatest predictive power, while unbiased features (mainly gene expression and co-expression data) had the least predictive power. Moreover, a combination of all the feature types diminished the predictive power compared to predictions based on curated knowledge. Feature importance analysis on the two most predictive classifiers mostly corroborated existing knowledge and supported recent findings linking DR to the Nuclear Factor Erythroid 2-Related Factor 2 (NRF2) signalling pathway and G protein-coupled receptors (GPCR). We then used the two strongest combinations of feature type and ML algorithm to predict DR-relatedness among ageing-related genes currently lacking DR-related annotations in the data, resulting in a set of promising candidate DR-related genes (GOT2, GOT1, TSC1, CTH, GCLM, IRS2 and SESN2) whose predicted DR-relatedness remains to be validated in future wet-lab experiments. CONCLUSIONS: This work demonstrated the strong potential of ML-based techniques to identify DR-associated features, as our findings are consistent with the literature and recent discoveries. Although the inference of new DR-related mechanistic findings based solely on GO terms and biological pathways was limited due to their knowledge-driven nature, the predictive power of these two feature types remained useful, as it allowed inferring new promising candidate DR-related genes. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04523-8.

    Ageing transcriptome meta-analysis reveals similarities between key mammalian tissues

    By combining transcriptomic data with other data sources, inferences can be made about functional changes during ageing. Thus, we conducted a meta-analysis on 127 publicly available microarray and RNA-Seq datasets from mice, rats and humans, identifying a transcriptomic signature of ageing across species and tissues. Analyses on subsets of these datasets produced transcriptomic signatures of ageing for brain, heart and muscle. We then applied enrichment analysis and machine learning to functionally describe these signatures, revealing overexpression of immune and stress response genes and underexpression of metabolic and developmental genes. Further analyses revealed little overlap between genes differentially expressed with age in different tissues, even though genes differentially expressed with age are typically widely expressed across tissues. Additionally, we show that the ageing gene expression signatures (particularly the overexpressed signatures) of the whole meta-analysis, brain and muscle tend to include genes that are central in protein-protein interaction networks. We also show that genes underexpressed with age in the brain are highly central in a co-expression network, suggesting that underexpression of these genes may have broad phenotypic consequences. In sum, we show numerous functional similarities between the ageing transcriptomes of these important tissues, along with unique network properties of genes differentially expressed with age in both protein-protein interaction and co-expression networks.

    A data mining approach for classifying DNA repair genes into ageing-related or non-ageing-related

    BACKGROUND: The ageing of the worldwide population means there is a growing need for research on the biology of ageing. DNA damage is likely a key contributor to the ageing process and elucidating the role of different DNA repair systems in ageing is of great interest. In this paper we propose a data mining approach, based on classification methods (decision trees and Naive Bayes), for analysing data about human DNA repair genes. The goal is to build classification models that allow us to discriminate between ageing-related and non-ageing-related DNA repair genes, in order to better understand their different properties. RESULTS: The main patterns discovered by the classification methods are as follows: (a) the number of protein-protein interactions was a predictor of DNA repair proteins being ageing-related; (b) the use of predictor attributes based on protein-protein interactions considerably increased predictive accuracy of attributes based on Gene Ontology (GO) annotations; (c) GO terms related to "response to stimulus" seem reasonably good predictors of ageing-relatedness for DNA repair genes; (d) interaction with the XRCC5 (Ku80) protein is a strong predictor of ageing-relatedness for DNA repair genes; and (e) DNA repair genes with a high expression in T lymphocytes are more likely to be ageing-related. CONCLUSIONS: The above patterns are broadly integrated in an analysis discussing relations between Ku, the non-homologous end joining DNA repair pathway, ageing and lymphocyte development. These patterns and their analysis support non-homologous end joining double strand break repair as central to the ageing-relatedness of DNA repair genes. Our work also showcases the use of protein interaction partners to improve accuracy in data mining methods and our approach could be applied to other ageing-related pathways.

    Reductions in hypothalamic Gfap expression, glial cells and α-tanycytes in lean and hypermetabolic Gnasxl-deficient mice

    BACKGROUND: Neuronal and glial differentiation in the murine hypothalamus is not complete at birth, but continues over the first two weeks postnatally. Nutritional status and leptin deficiency can influence the maturation of neuronal projections and glial patterns, and hypothalamic gliosis occurs in mouse models of obesity. Gnasxl constitutes an alternative transcript of the genomically imprinted Gnas locus and encodes a variant of the signalling protein Gαs, termed XLαs, which is expressed in defined areas of the hypothalamus. Gnasxl-deficient mice show postnatal growth retardation and undernutrition, while surviving adults remain lean and hypermetabolic with increased sympathetic nervous system (SNS) activity. Effects of this knock-out on the hypothalamic neural network have not yet been investigated. RESULTS: RNAseq analysis for gene expression changes in hypothalami of Gnasxl-deficient mice indicated glial fibrillary acidic protein (Gfap) expression to be significantly down-regulated in adult samples. Histological analysis confirmed a reduction in Gfap-positive glial cell numbers specifically in the hypothalamus. This reduction was observed in adult tissue samples, whereas no difference was found in hypothalami of postnatal stages, indicating an adaptation in adult Gnasxl-deficient mice to their earlier growth phenotype and hypermetabolism. Especially noticeable was a loss of many Gfap-positive α-tanycytes and their processes, which form part of the ependymal layer that lines the medial and dorsal regions of the 3rd ventricle, while β-tanycytes along the median eminence (ME) and infundibular recesses appeared unaffected. This was accompanied by local reductions in Vimentin and Nestin expression. Hypothalamic RNA levels of glial solute transporters were unchanged, indicating a potential compensatory up-regulation in the remaining astrocytes and tanycytes. CONCLUSION: Gnasxl deficiency does not directly affect glial development in the hypothalamus, since it is expressed in neurons, and Gfap-positive astrocytes and tanycytes appear normal during early postnatal stages. The loss of Gfap-expressing cells in adult hypothalami appears to be a consequence of the postnatal undernutrition, hypoglycaemia and continued hypermetabolism and leanness of Gnasxl-deficient mice, which contrasts with gliosis observed in obese mouse models. Since α-tanycytes also function as adult neural progenitor cells, these findings might indicate further developmental abnormalities in hypothalamic formations of Gnasxl-deficient mice, potentially including neuronal composition and projections.

    Predicting the pro-longevity or anti-longevity effect of model organism genes with new hierarchical feature selection methods

    Ageing is a highly complex biological process that is still poorly understood. With the growing amount of ageing-related data available on the web, in particular concerning the genetics of ageing, it is timely to apply data mining methods to that data, in order to try to discover novel patterns that may assist ageing research. In this work, we introduce new hierarchical feature selection methods for the classification task of data mining and apply them to ageing-related data from four model organisms: Caenorhabditis elegans (worm), Saccharomyces cerevisiae (yeast), Drosophila melanogaster (fly), and Mus musculus (mouse). The main novel aspect of the proposed feature selection methods is that they exploit hierarchical relationships in the set of features (Gene Ontology terms) in order to improve the predictive accuracy of the Naïve Bayes and 1-Nearest Neighbour (1-NN) classifiers, which are used to classify model organisms’ genes into pro-longevity or anti-longevity genes. The results show that our hierarchical feature selection methods, when used together with Naïve Bayes and 1-NN classifiers, obtain higher predictive accuracy than the standard (without feature selection) Naïve Bayes and 1-NN classifiers, respectively. We also discuss the biological relevance of a number of Gene Ontology terms very frequently selected by our algorithms in our datasets.

    Comparing enrichment analysis and machine learning for identifying gene properties that discriminate between gene classes

    Biologists very often use enrichment methods based on statistical hypothesis tests to identify gene properties that are significantly over-represented in a given set of genes of interest, by comparison with a ‘background’ set of genes. These enrichment methods, although based on rigorous statistical foundations, are not always the best single option to identify patterns in biological data. In many cases, one can also use classification algorithms from the machine-learning field. Unlike enrichment methods, classification algorithms are designed to maximize measures of predictive performance and are capable of analysing combinations of gene properties, instead of one property at a time. In practice, however, the majority of studies use either enrichment or classification methods (rather than both), and there is a lack of literature discussing the pros and cons of both types of methods. The goal of this paper is to compare and contrast enrichment and classification methods, offering two contributions. First, we discuss the (to some extent complementary) advantages and disadvantages of both types of methods for identifying gene properties that discriminate between gene classes. Second, we provide a set of high-level recommendations for using enrichment and classification methods. Overall, by highlighting the strengths and the weaknesses of both types of methods, we argue that both should be used in bioinformatics analyses.
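As a rough illustration of the enrichment side of this comparison, the sketch below computes a one-sided hypergeometric (over-representation) p-value for a single gene property. The abstract does not say which test the paper uses, so the choice of test, the function name `hypergeom_enrichment_pvalue` and the toy counts are all assumptions made for this example.

```python
from math import comb


def hypergeom_enrichment_pvalue(background_size, background_with_property,
                                set_size, set_with_property):
    """One-sided over-representation p-value (hypothetical helper).

    Returns P(X >= set_with_property), where X is the number of genes
    carrying the property when `set_size` genes are drawn without
    replacement from a background of `background_size` genes, of which
    `background_with_property` carry the property.
    """
    total = comb(background_size, set_size)
    upper = min(set_size, background_with_property)
    tail = sum(
        comb(background_with_property, k)
        * comb(background_size - background_with_property, set_size - k)
        for k in range(set_with_property, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers (assumed): 1000 background genes, 50 annotated with the
    # property; 20 genes of interest, 6 of which carry the property.
    p = hypergeom_enrichment_pvalue(1000, 50, 20, 6)
    print(f"enrichment p-value: {p:.3g}")
```

A test of this kind scores one property at a time, which is the limitation the abstract contrasts with classification algorithms that can analyse combinations of properties.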